DETAILED ACTION

1. This action is responsive to communications: The Amendment filed 06/22/07.

2. Claims 1-3, 6-12, 15-22 remain rejected under 35 U.S.C. 103(a) as being 
unpatentable over Russell-Falla et al (US-6,675,162 01/06/04) in view of Chakrabarti et 
al (US-6,389,436 05/14/02) in further view of Shmueli et al (US-6,442,555 08/27/02). 

3. Claims 4-5 and 13-14 remain rejected under 35 U.S.C. 103(a) as being
unpatentable over Russell-Falla et al (US-6,675,162 01/06/04) in view of Chakrabarti et
al (US-6,389,436 05/14/02) in view of Shmueli et al (US-6,442,555 08/27/02) in further
view of Haug et al (US-6,556,964 04/29/03).

4. Claims 1-22 are pending in the case. Claims 1 and 10 are independent claims. 

Claim Rejections - 35 USC § 103 

5. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

6. Claims 1-3, 6-12, 15-22 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Russell-Falla et al (US-6,675,162 01/06/04) in view of Chakrabarti et al
(US-6,389,436 05/14/02) in further view of Shmueli et al (US-6,442,555 08/27/02).

-In regard to independent claims 1 and 10, Russell-Falla et al teach a method 
and apparatus for determining content type of a web page comprising: 

providing a predefined set of potential content types exclusive of indicating 
formal language of the content (categories of content)(column 2, lines 35-43); 

for each potential content type (categories of content)(column 2, lines 35-43)(e.g.
"pornographic", "racist", etc)(column 3, lines 39-43), preparing a distinguishing series of 
binary tests (column 2, lines 56-63; column 3, lines 39-57; column 4, lines 61-66)(i.e. 
testing each keyword or regular expression of the content against a database of keywords 
and regular expressions common to the content type for matching and weighting 
purposes), wherein at least one test determines whether a predefined piece of data or 
keyword appears in URLs (column 2, lines 5-9) of the subject Web page (column 2, lines 
56-63; column 3, lines 23-57; column 4, lines 61-66: i.e. testing all of a web page's textual
content, which as appreciated by one skilled in the art would include the text in the URLs
of the selected web page);

for each potential content type (categories of content)(column 2, lines 35-43)(e.g. 
"pornographic", "racist", etc)(column 3, lines 39-43), running the distinguishing series of 
tests (column 2, lines 56-63; column 3, lines 39-57; column 4, lines 61-66)(i.e. testing 
each keyword or regular expression against a database of keywords and regular 
expressions common to the content type for matching and weighting purposes) enabling 
quantitative evaluation of some contents of the selected web page being of the potential 
content type (column 2, lines 55-64); 

mathematically combining the test results (column 3, lines 54-57); and 

based on the results, assigning a probability (equivalent to the final rating of the 
page relative to the content category), for each potential content type, that shows the 
likelihood that some contents of that type exist on the selected web page exclusive of 
indicating the language in which the content was written (column 3, lines 2-6). 
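
For illustration only, the per-category keyword testing and weighted combination described above might be sketched as follows. This is not code from Russell-Falla et al. The category names, patterns, weights, bias, and the logistic function used to map the weighted sum into a rating between 0 and 1 are all assumptions chosen for the example.

```python
import math
import re

# Hypothetical weighted pattern lists; the terms and weights are illustrative only.
CATEGORY_TESTS = {
    "gambling": [(r"\bcasino\b", 2.0), (r"\bpoker\b", 1.5), (r"\bjackpot\b", 1.0)],
    "news": [(r"\breported\b", 1.0), (r"\bpress release\b", 1.5)],
}


def run_binary_tests(text, urls, tests):
    """Return one 0/1 result per test: does the pattern appear in the text or any URL?"""
    haystack = " ".join([text] + list(urls)).lower()
    return [1 if re.search(pattern, haystack) else 0 for pattern, _ in tests]


def rate_page(text, urls, bias=-2.0):
    """Combine the weighted test results per category and map them into (0, 1)."""
    ratings = {}
    for category, tests in CATEGORY_TESTS.items():
        results = run_binary_tests(text, urls, tests)
        total = sum(r * w for r, (_, w) in zip(results, tests))
        # Assumed combination rule: logistic function of the weighted sum.
        ratings[category] = 1.0 / (1.0 + math.exp(-(total + bias)))
    return ratings
```

Calling `rate_page("Win the jackpot tonight", ["http://example.com/casino/poker"])` yields a high rating for the hypothetical "gambling" category and a low one for "news", since two of the three gambling patterns match only in the URL.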

Russell-Falla et al do not teach wherein the distinguishing series of tests includes
at least one or more non-binary tests. Chakrabarti et al teach a distinguishing series of 
tests to determine the potential content type of a web page (column 4, lines 5-15), 
wherein the distinguishing series of tests include non-binary tests (i.e. the test results in 
more than two possible outcomes)(column 6, lines 27-67; column 7, lines 1-4). It would 
have been obvious to one of ordinary skill in the art at the time of the invention for 
Russell-Falla et al to have run the additional non-binary tests as taught in Chakrabarti et 
al for determining the content type of web pages, because Chakrabarti et al teach that by 
utilizing in/out links of the web page with the hypertext classifier (Fig. 1: 110) the
accuracy of classification goes up over those tests utilizing only local text/terms of the 
document to be classified (column 7, lines 34-59). 
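
As a hedged sketch of what a non-binary, link-based test could look like, the following computes a graded score from the labels of linked pages and blends it with a binary local-text result. It is not the hypertext classifier of Chakrabarti et al. The function names, the use of a simple fraction, and the `link_weight` parameter are assumptions made for the example.

```python
def neighbor_label_fraction(neighbor_labels, category):
    """Non-binary test: fraction of linked pages already labeled with the category."""
    if not neighbor_labels:
        return 0.0
    hits = sum(1 for label in neighbor_labels if label == category)
    return hits / len(neighbor_labels)


def combined_score(keyword_hit, neighbor_labels, category, link_weight=0.5):
    """Blend a binary local-text result (0 or 1) with the graded link-based result."""
    link_score = neighbor_label_fraction(neighbor_labels, category)
    return (1.0 - link_weight) * keyword_hit + link_weight * link_score
```

Unlike a keyword-present test, `neighbor_label_fraction` can return any value between 0 and 1, which is what makes it non-binary in the sense used above.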

Russell-Falla et al and Chakrabarti et al do not specifically teach wherein the 
binary test and non-binary test further include a test for classifying the content type based 
on examining page format or style other than position of data or keyword in the subject 
web page. Shmueli et al teaches a method for classifying a new document as a particular 
type based on determining the format information within each portion of the document, 
wherein the format information includes font, font size, and justification (column 1, lines 
28-47: "font, font size, and justification"). It would have been obvious to one of ordinary 
skill in the art at the time of the invention for one of the distinguishing series of tests of 
Russell-Falla and Chakrabarti et al to have analyzed the page format or style of the 
document web page as shown in Shmueli et al, because Shmueli et al teach that by 
utilizing a more robust document decomposition that looked specifically at document 
format, a document could be automatically recognized and classified (column 1, lines 28-47;
column 2, lines 13-17).

-In regard to dependent claims 2, 11, and 15, Russell-Falla et al further teach 
wherein the set of potential content types could include web page articles/news with 
information about people (e.g. pornography, racism, terrorism) and other content (column 
2, lines 10-23; column 3, lines 41-43). 

-In regard to dependent claims 3 and 12, Russell-Falla et al further teach 
producing a respective confidence level (equivalent to the rating of the page relative to 
the content category) for each potential content type when at least some of the web page 
content was of that type (column 2, lines 54-67; column 3, lines 1-6).

-In regard to dependent claims 6 and 16, Russell-Falla et al further teach 
wherein the step of running the tests includes determining whether a predefined piece of 
data or keyword ("weighting list") appears in the web page (column 2, lines 56-63). 

-In regard to dependent claims 7 and 17, Russell-Falla et al further teach 
wherein the step of running the tests includes determining whether a predefined piece of 
data or keyword ("weighting list") appears in the web page (column 2, lines 56-63). 

-In regard to dependent claims 8, 9, and 18, Russell-Falla et al do not teach 
storing indications of the assigned probabilities (web page ratings) of each potential 
content type cross referenced with each respective web page in a database. It would have
been obvious to one of ordinary skill in the art at the time of the invention to have stored
previously viewed web pages along with their respective ratings for content types local
to the user, because it was well known in the art at the time of the invention that storing
frequently viewed web pages with their ratings would significantly reduce the
determination/processing time of the Russell-Falla et al system by eliminating undue
identifying, analyzing, and calculating on identical web page requests. Thus a repeated
request could render an appropriate web page more efficiently, which would benefit
Russell-Falla et al, which teach that analyzing web pages could be difficult and time
consuming (column 1, lines 38-43).
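
The storage rationale above can be sketched, purely as an illustration, with an in-memory store keyed by URL. The class name `RatingCache`, the `rate_fn` callback, and the use of a Python dictionary in place of a database are assumptions made for the example and do not come from any cited reference.

```python
class RatingCache:
    """Keep per-page ratings keyed by URL so repeated requests skip re-analysis."""

    def __init__(self, rate_fn):
        self._rate_fn = rate_fn  # hypothetical function: URL -> {content type: rating}
        self._store = {}         # stands in for the database of stored ratings
        self.misses = 0          # number of times a page had to be analyzed

    def get_ratings(self, url):
        if url not in self._store:
            self.misses += 1
            self._store[url] = self._rate_fn(url)
        return self._store[url]
```

A second request for the same URL returns the stored ratings without calling `rate_fn` again, which is the processing-time saving the paragraph describes.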

-In regard to dependent claims 19 and 20, Russell-Falla et al teaches wherein
one of the tests was determining the number of phrases that contain the keyword (column 2,
lines 56-63; column 3, lines 39-57; column 4, lines 61-67; column 5, lines 1-22: i.e. the 
number of keywords from the selected web page that have a corresponding entry in the 
pre-existing database are placed in a weighted list and then summed together to 
determine a rating to be compared to a given threshold). 

-In regard to dependent claims 21 and 22, Russell-Falla teaches in view of the 
Shmueli et al reference wherein one test included examining page format or style other 
than position of data. Because a test for examining syntax or grammar was listed in the
alternative, the limitations of these claims do not further limit their parent claims via the
path selected by the Examiner and are thus not considered. 

7. Claims 4-5 and 13-14 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Russell-Falla et al (US-6,675,162 01/06/04) in view of Chakrabarti et al
(US-6,389,436 05/14/02) in view of Shmueli et al (US-6,442,555 08/27/02) in further view of
Haug et al (US-6,556,964 04/29/03).

-In regard to dependent claims 4 and 13, Russell-Falla et al further teach 
wherein the test results utilize a neural network (column 4, lines 1-5). Russell-Falla et al 
do not teach wherein the combining of the test results includes using a Bayesian network. 
Haug et al teach wherein the application of a Bayesian network for statistical pattern 
recognition provides improved system performance with additional training of the 
network (column 3, lines 8-16). It would have been obvious to one of ordinary skill in 
the art at the time of the invention, for the invention of Russell-Falla et al to have 
employed a Bayesian network as shown in Haug et al, to achieve the above mentioned 
improved system performance, because Russell-Falla et al do provide the needed training 
of the network (column 3, lines 58-67) which would be needed to increase the statistical 
recognition needed to support the Bayesian network. 
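
For illustration, one simple way to combine binary test results probabilistically is a naive Bayes calculation, which is a Bayesian network with the simplest possible structure (every test depends only on the content type). This is a swapped-in minimal technique for the example, not the network of Haug et al. The function name, the two-class setup, and the likelihood values are assumptions; in practice the likelihoods would be estimated from statistics collected on training pages of known content type.

```python
def posterior(prior, likelihoods, results):
    """Return P(type | results), assuming tests are independent given the type.

    prior:       P(type) before any test is run.
    likelihoods: list of (P(test passes | type), P(test passes | not type)).
    results:     list of 0/1 test outcomes, in the same order as likelihoods.
    """
    p_type, p_other = prior, 1.0 - prior
    for (p_pass_type, p_pass_other), passed in zip(likelihoods, results):
        p_type *= p_pass_type if passed else 1.0 - p_pass_type
        p_other *= p_pass_other if passed else 1.0 - p_pass_other
    return p_type / (p_type + p_other)
```

With a prior of 0.5 and a single test that passes on 80% of pages of the type and 20% of other pages, a passing result gives 0.4 / (0.4 + 0.1) = 0.8.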

-In regard to dependent claims 5 and 14, Russell-Falla et al further teach the 
step of training the neural network using a training set of web pages with respective 
known content types and collecting the statistics on the test results of the training web 
pages (column 3, lines 58-67). 



Response to Arguments 

8. Applicant's arguments filed 06/22/07 have been fully considered but they are not
persuasive. 

-In response to applicant's argument that there is no suggestion to combine the 
references (i.e. the Russell-Falla and Shmueli references), the examiner recognizes that
obviousness can only be established by combining or modifying the teachings of the prior 
art to produce the claimed invention where there is some teaching, suggestion, or 
motivation to do so found either in the references themselves or in the knowledge 
generally available to one of ordinary skill in the art. See In re Fine, 837 F.2d 1071, 5 
USPQ2d 1596 (Fed. Cir. 1988) and In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. 
Cir. 1992). In this case, Russell-Falla teaches a method/system for categorizing a 
selected web page based on one or more selected categories of content. While the 
preferred example embodiment of Russell-Falla was concerned with blocking web pages 
that were categorized as being unsuitable or potentially harmful (e.g. pornographic, 
racist, etc), Russell-Falla additionally taught alternative embodiments for categorizing 
web pages for any other reason that one might want to identify particular web pages 
based on their content (column 2, lines 44-49; column 4, lines 35-44). As discussed in 
the rejection, the result of Russell-Falla's plurality of tests is a binary test that categorizes
a given web page in relation to a selected content category. 

Russell-Falla does not teach wherein one of the plurality of tests was examining 
the web page for format or style other than the position of data or a keyword in the web 
page. Shmueli et al cures this deficiency by teaching that it was notoriously well known
in the art at the time of the invention for document classifiers to recognize the size,
location, and organization of distinct portions of an electronic document as well as
determining format information of each portion of the electronic document to include, for
example, font, font size, and justification (column 1, lines 34-47). Thus Shmueli teaches
another well known document classification method, which instead of analyzing the
natural language text of a given document, analyzes a document's format and style to
determine a document's content type. Therefore it would have been obvious to one of
ordinary skill in the art at the time of the invention for one of the distinguishing series of 
tests of Russell-Falla and Chakrabarti et al to have analyzed the page format or style of 
the document web page as shown in Shmueli et al, because Shmueli et al teach that by 
utilizing a more robust document decomposition that looked specifically at document 
format, a document could be automatically recognized and classified (column 1, lines 28-47;
column 2, lines 13-17).

The Applicant's arguments (Page 12) are unproven. Clearly one of ordinary skill 
in the art at the time of the invention would have appreciated the use and cited benefits of 
the plurality of well known techniques for classifying an electronic document as 
disclosed in the Russell-Falla, Chakrabarti et al, and Shmueli references. Furthermore,
Applicant's comments appear to contradict Applicant's own claimed invention that could 
require a web document's format or style in order to determine the content type of a 
subject web page. If, as Applicant suggests, an electronic document's format or style
"cannot provide any insight on the content type and hence category or classification of a 
subject Web Page," then serious doubts are raised with regards to the independent claims. 

The arguments with regards to the Chakrabarti reference are similar to those as 
discussed above and are considered not persuasive based on the same rationale.

Conclusion

9. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of 
time policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the
advisory action. In no event, however, will the statutory period for reply expire later than 
SIX MONTHS from the mailing date of this final action. 

10. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Adam L. Basehoar whose telephone number is (571) 272-4121.
The examiner can normally be reached on M-F: 7:00am - 4:00pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Steve Hong, can be reached on (571) 272-4124. The fax phone number for
the organization where this application or proceeding is assigned is 571-273-8300. 



Application/Control Number: 09/768,869
Page 11
Art Unit: 2178

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. Status 
information for unpublished applications is available through Private PAIR only. For 
more information about the PAIR system, see http://pair-direct.uspto.gov. Should you 
have questions on access to the Private PAIR system, contact the Electronic Business 
Center (EBC) at 866-217-9197 (toll-free).